In this part of the course, we will cover the following concepts:
This course focuses on a kind of advanced classification method called a decision tree
Every decision tree starts with a specific decision called the root node
The root and internal nodes hold questions you have to answer; the leaf nodes hold the final outcomes
Branches are lines that connect the nodes
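To make the vocabulary concrete, here is a minimal sketch of a tree as a data structure. The class names (`Node`, `Leaf`), the `classify` helper, and the patient features and thresholds are all hypothetical and chosen only for illustration. The sketch assumes that the root and internal nodes store a question, that branches are the `yes` / `no` links, and that leaves store the final outcome.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    """A leaf node: holds a final outcome, not a question."""
    outcome: str


@dataclass
class Node:
    """A root or internal node: holds a yes/no question and two branches."""
    question: str               # human-readable description of the test
    feature: str                # key to look up in the record
    threshold: float            # answer is "yes" when record[feature] >= threshold
    yes: Union["Node", Leaf]    # branch followed when the answer is yes
    no: Union["Node", Leaf]     # branch followed when the answer is no


def classify(node, record):
    """Follow branches from the root until a leaf is reached."""
    while isinstance(node, Node):
        node = node.yes if record[node.feature] >= node.threshold else node.no
    return node.outcome


# Hypothetical tree: the root asks about age, one internal node asks
# about blood pressure, and three leaves hold the outcomes.
root = Node(
    question="Is the patient 60 or older?",
    feature="age",
    threshold=60,
    yes=Node(
        question="Is systolic blood pressure 140 or higher?",
        feature="systolic_bp",
        threshold=140,
        yes=Leaf("high risk"),
        no=Leaf("medium risk"),
    ),
    no=Leaf("low risk"),
)

print(classify(root, {"age": 72, "systolic_bp": 150}))  # high risk
print(classify(root, {"age": 35, "systolic_bp": 150}))  # low risk
```

Classifying a record means starting at the root, answering each question, and following the matching branch until a leaf is reached.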
Which number on the diagram corresponds to the leaf node?
Share your response in the chat box
How about the root node?
Share your response in the chat box
Finally, what is the name of number 2?
Share your response in the chat box
Decision trees are a supervised machine learning model used to perform classification
| Question | Example |
|---|---|
| What is this object like? | Selecting similar medicines with similar purposes |
| Who is this person like? | Anticipating behavior or preferences of a person based on her similarities with others |
| What category is this in? | Anticipating if your patient is high risk, has an illness, will develop symptoms, etc. |
| What is the probability that something is in a given category? | Determining the probability that a drug is in a particular category; determining the probability that someone will contract an illness |
| Objectives | Complete |
|---|---|
| Discuss use cases for Decision Trees | |
| Summarize the concepts and math behind Decision Trees | |
| Objectives | Complete |
|---|---|
| Discuss use cases for Decision Trees | ✔ |
| Summarize the concepts and math behind Decision Trees | |
Which question is more important on a date?
The most relevant question
When should you stop asking questions?
When the answer no longer provides additional relevant information
How do we decide which node to split and how to split it?
Two impurity functions are most commonly used with tree-based models: Gini impurity and entropy (information gain)
The DecisionTreeClassifier in sklearn.tree uses Gini by default, so this is the method we will focus on in this module
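Gini measures the probability of misclassifying a randomly chosen sample at a node if it were labeled according to the class distribution at that node. It ranges from 0 (all samples in one class) up to 1 - 1/c for c classes, so it is always below 1; with two classes the maximum is 0.5. In the formula, p_j is the proportion of samples in node E that belong to class j.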
\[Gini(E) = 1 - \sum_{j=1}^{c}p_j^2 \]
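The formula can be checked with a few lines of plain Python. This is a sketch, not how sklearn implements it internally: the helper names (`gini`, `weighted_gini`, `best_threshold`) and the small age / risk dataset are made up for illustration. It assumes a split is scored by the size-weighted average of the Gini values of its two children, and that a split is only worth making if it lowers the impurity.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_j ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(left, right):
    """Impurity of a split: child impurities weighted by child size."""
    n = len(left) + len(right)
    return (len(left) / n) * gini(left) + (len(right) / n) * gini(right)


def best_threshold(values, labels):
    """Try each midpoint between sorted distinct values.

    Returns (threshold, impurity) for the split with the lowest weighted
    Gini, or (None, parent impurity) if no split improves on the parent.
    """
    distinct = sorted(set(values))
    best = (None, gini(labels))  # keep the node unsplit unless impurity drops
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x < threshold]
        right = [y for x, y in zip(values, labels) if x >= threshold]
        score = weighted_gini(left, right)
        if score < best[1]:
            best = (threshold, score)
    return best


# Made-up data for illustration only
ages = [25, 32, 47, 51, 62, 68]
risk = ["low", "low", "low", "high", "high", "high"]

print(gini(risk))                   # 0.5 (two classes, evenly mixed)
print(gini(["low", "low", "low"]))  # 0.0 (pure node)
print(best_threshold(ages, risk))   # (49.0, 0.0)
```

An evenly mixed two-class node has the maximum Gini of 0.5, a pure node has 0, and the split at age 49 separates the two classes completely. When no candidate lowers the impurity, `best_threshold` returns `None` as the threshold, which matches the earlier idea of stopping once a question adds no relevant information.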
So far, we have learned about growing the tree and how to make the choice in step 2. To get to steps 3 and 4, we will work with a dataset next.
| Objectives | Complete |
|---|---|
| Discuss use cases for Decision Trees | ✔ |
| Summarize the concepts and math behind Decision Trees | ✔ |